Binary Tree-based Accuracy-keeping Clustering Using CDA for Very Fast Japanese Character Recognition
Abstract
Real-time character recognition in video frames has attracted great attention from developers since scene text recognition emerged as a new field of Optical Character Recognition (OCR) applications. Some East Asian languages, such as Japanese and Chinese, have thousands of characters, so character recognition generally takes much longer than for European languages. Speeding up character recognition is crucial for developing software for mobile devices such as smartphones. This paper proposes a binary tree-based clustering technique using Canonical Discriminant Analysis (CDA) that keeps accuracy as high as possible. The experimental results show that character recognition using the proposed clustering technique is 10.2 times faster than full linear matching, at a mere 0.04% drop in accuracy. When the proposed method is combined with the Sequential Similarity Detection Algorithm (SSDA), character matching becomes 12.3 times faster at exactly the same accuracy drop.
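The abstract names SSDA without describing it. As a rough illustration only, the general idea behind SSDA-style matching is to accumulate the matching error term by term and abandon a candidate as soon as its running error can no longer beat the best match found so far. The sketch below is not the paper's implementation: the function name `ssda_nearest`, the use of L1 distance, and the plain-list feature vectors are all assumptions made for illustration.

```python
def ssda_nearest(query, templates):
    """Find the template closest to `query` under L1 distance.

    Hypothetical helper: accumulates the error one feature at a time and
    stops early once the running error reaches the best distance so far.
    Returns (index, distance) of the best template, or (-1, inf) if
    `templates` is empty.
    """
    best_idx, best_dist = -1, float("inf")
    for idx, tpl in enumerate(templates):
        err = 0.0
        for q, t in zip(query, tpl):
            err += abs(q - t)
            if err >= best_dist:
                # This candidate cannot beat the current best: abandon it.
                break
        else:
            # Loop finished without early exit, so err < best_dist.
            if err < best_dist:
                best_idx, best_dist = idx, err
    return best_idx, best_dist


# Usage: three toy templates, query closest to the second one.
templates = [[0, 0, 0], [1, 2, 3], [5, 5, 5]]
print(ssda_nearest([1, 2, 4], templates))
```

Early termination does not change which template wins compared with a full linear scan; it only skips arithmetic on candidates that are already worse than the current best. That is consistent with the abstract's report that adding SSDA increases speed at the same accuracy drop.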
Related resources
Facial expression recognition based on Local Binary Patterns
Classical LBP has drawbacks such as complexity and high-dimensional feature vectors, which make it necessary to apply dimension reduction processes. In this paper, we introduce an improved LBP algorithm to solve these problems that uses the Fast PCA algorithm to reduce the dimensionality of the extracted feature vectors. In other words, the proposed method (Fast PCA+LBP) is an improved LBP algorithm that is extracted ...
Multi-class Classification Using Support Vector Machines in Binary Tree Architecture
This paper presents an architecture of Support Vector Machine classifiers arranged in a binary tree structure for solving multi-class classification problems with increased efficiency. The proposed SVM-based Binary Tree Architecture (SVM-BTA) takes advantage of both the efficient computation of the tree architecture and the high classification accuracy of SVMs. Clustering algorithm is used to conv...
COGNITUS - Fast and Reliable Recognition of Handwritten Forms Based on Vector Quantisation
We report on an efficient intelligent character recognition tool for the automatic treatment of handwritten bank transfer forms. The classification is based on nearest-neighbor algorithms and a novel binary clustering technique for the generation of large prototype sets. We introduce a new confidence measure which can be used on a decision tree structure to combine lowest error rates with a very hi...
Use of Mutual Information Based Character Clusters in Dictionary-less Morphological Analysis of Japanese
For languages whose character set is very large and whose orthography does not require spacing between words, such as Japanese, tokenizing and part-of-speech tagging are often the difficult parts of any morphological analysis. For practical systems to tackle this problem, uncontrolled heuristics are primarily used. The use of information on character sorts, however, mitigates this difficulty. T...
Font Recognition Using Shape-Based Quad-tree and Kd-tree Decomposition
The search for appropriate data representations and visual features for content-based image retrieval continues within the computer vision community, alongside the development of new matching and indexing techniques to facilitate fast search in large-scale image databases. In this study, we present a solution to the problem of typeface identification and character recognition in text-based imag...
Publication date: 2011